Beating the Direct Sum Theorem in Communication Complexity with Implications for Sketching

Authors

Marco Molinaro

David P. Woodruff

Grigory Yaroslavtsev

Abstract

A direct sum theorem for two parties and a function $f$ states that the communication cost of solving $k$ copies of $f$ simultaneously with error probability 1/3 is at least $k \cdot R_{1/3}(f)$, where $R_{1/3}(f)$ is the communication required to solve a single copy of $f$ with error probability 1/3. We improve this for a natural family of functions $f$, showing that the 1-way communication required to solve $k$ copies of $f$ simultaneously with probability 2/3 is $\Omega(k \cdot R_{1/k}(f))$. Since $R_{1/k}(f)$ may be as large as $\Omega(R_{1/3}(f) \cdot \log k)$, we asymptotically beat the direct sum bound for such functions, showing that the trivial upper bound of solving each of the $k$ copies of $f$ with probability $1 - O(1/k)$ and taking a union bound is optimal! In order to achieve this, our direct sum involves a novel measure of information cost which allows a protocol to abort with constant probability, and otherwise must be correct with very high probability. Moreover, for the functions considered, we show strong lower bounds on the communication cost of protocols with these relaxed guarantees; indeed, our lower bounds match those for protocols that are not allowed to abort. In the distributed and streaming models, where one wants to be correct not only on a single query, but simultaneously on a sequence of $n$ queries, we obtain optimal lower bounds on the communication or space complexity. Lower bounds obtained from our direct sum result show that a number of techniques in the sketching literature are optimal, including the following:
• (JL transform) Lower bound of $\Omega(\frac{1}{\epsilon^2} \log \frac{n}{\delta})$ on the dimension of (oblivious) Johnson-Lindenstrauss transforms.
• ($\ell_p$-estimation) Lower bound for the size of encodings of $n$ vectors in $[\pm M]^d$ that allow $\ell_1$ or $\ell_2$-estimation of $\Omega(n \epsilon^{-2} \log \frac{n}{\delta} (\log d + \log M))$.
• (Matrix sketching) Lower bound of $\Omega(\frac{1}{\epsilon^2} \log \frac{n}{\delta})$ on the dimension of a matrix sketch $S$ satisfying the entrywise guarantee $|(A S S^T B)_{i,j} - (AB)_{i,j}| \le \epsilon \|A_i\|_2 \|B^j\|_2$.
• (Database joins) Lower bound of $\Omega(n \frac{1}{\epsilon^2} \log \frac{n}{\delta} \log M)$ for sketching frequency vectors of $n$ tables in a database, each with $M$ records, in order to allow join size estimation.

∗Supported by an IBM PhD fellowship. Work done while visiting IBM Almaden Research Center.
†Work done during an internship at IBM Almaden Research Center. Supported by College of Engineering Fellowship at Pennsylvania State University.
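
The "trivial upper bound" mentioned in the abstract can be spelled out as a short union-bound calculation. This is a minimal sketch, assuming each of the $k$ copies is solved independently by a protocol with error at most $1/(3k)$; that specific constant is an illustrative choice, not taken from the paper, which only states success probability $1 - O(1/k)$ per copy.

```latex
\begin{align*}
\Pr[\text{some copy is answered incorrectly}]
  &\le \sum_{i=1}^{k} \Pr[\text{copy } i \text{ is answered incorrectly}] \\
  &\le k \cdot \frac{1}{3k} = \frac{1}{3}, \\
\text{total communication}
  &\le k \cdot R_{1/(3k)}(f).
\end{align*}
```

Under this assumption, all $k$ answers are correct with probability at least 2/3, at a cost of $k$ times the single-copy cost with per-copy error $O(1/k)$. The abstract's lower bound of $\Omega(k \cdot R_{1/k}(f))$ has the same form, which is the sense in which this simple strategy is described as optimal.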

Similar resources

Lower Bounds for Edit Distance and Product Metrics via Poincaré-Type Inequalities

We prove that any sketching protocol for edit distance achieving a constant approximation requires nearly logarithmic (in the strings’ length) communication complexity. This is an exponential improvement over the previous, doubly-logarithmic, lower bound of [Andoni-Krauthgamer, FOCS’07]. Our lower bound also applies to the Ulam distance (edit distance over nonrepetitive strings). In this special...

Improved direct sum theorem in classical communication complexity

For a function $f : X \times Y \to Z$, the $m$-fold direct sum is the function $f^m : X^m \times Y^m \to Z^m$, defined by $f^m(\langle x_1, \ldots, x_m \rangle, \langle y_1, \ldots, y_m \rangle) \triangleq \langle f(x_1, y_1), \ldots, f(x_m, y_m) \rangle$. We show the following direct sum theorem for classical communication protocols, $R(f^m) = \Omega(\frac{m}{k} R(f))$ where $R(f)$ is the $k$-round private coins communication complexity of $f$ and $R(f)$ is the $k$-round public coin complexity of $f$. I...

Linear Sketching over $\mathbb F_2$

We initiate a systematic study of linear sketching over $\mathbb{F}_2$. For a given Boolean function $f : \{0, 1\}^n \to \{0, 1\}$ a randomized $\mathbb{F}_2$-sketch is a distribution $M$ over $d \times n$ matrices with elements over $\mathbb{F}_2$ such that $Mx$ suffices for computing $f(x)$ with high probability. We study a connection between $\mathbb{F}_2$-sketching and a two-player one-way communication game for the corresponding XOR-function. Our results show th...

A Direct Sum Theorem in Communication Complexity via Message Compression

We prove lower bounds for the direct sum problem for two-party bounded error randomised multiple-round communication protocols. Our proofs use the notion of information cost of a protocol, as defined by Chakrabarti et al. [CSWY01] and refined further by Bar-Yossef et al. [BJKS02]. Our main technical result is a ‘compression’ theorem saying that, for any probability distribution μ over the inputs...

Distributive lattices with strong endomorphism kernel property as direct sums

Unbounded distributive lattices which have the strong endomorphism kernel property (SEKP), introduced by Blyth and Silva in [3], were fully characterized in [11] using Priestley duality (see Theorem 2.8). We shall determine the structure of special elements (which are introduced after Theorem 2.8 under the name strong elements) and show that these lattices can be considered as a direct product of ...

Publication date: 2013
